Introduction

AirBnB

The AirBnB data contains detailed information on all of the approximately fifty thousand private apartment listings offered for rent through the site in New York City. The data predates the COVID pandemic because I wanted to remove that additional complication from the analysis.

Selected Variables:

  • id: ID number of the listing
  • transit: description of the transit options
  • host_id: a unique ID for the host
  • host_listings_count: how many places does the host rent out?
  • latitude/longitude: the Geo coordinates
  • room_type / accommodates / bathrooms / bedrooms: some info about the place, like number of bed- and bathrooms, whether it is shared etc.
  • price: nightly price
  • availability_365: What part of the year is the property available to be rented?
  • Number of reviews / Review scores

Many other variables are included; many of them are not needed and are discarded.

Maps

There are three types of maps:

  • a vector map of NYC boroughs, as we used them in lecture
  • a map of NYC neighborhoods
  • a map of subway lines and stations

Maps: Neighborhoods of NYC

The file neighbourhoods.geojson is a GeoJSON file of NYC neighborhoods.

Importing datasets, spatial objects and cleaning

To start off, the neighborhood and borough geospatial objects are imported, along with the data frame that contains the Airbnb listings. The required columns are selected, and the price columns are cleaned so that they are numeric. Where a monthly price is not given, it is imputed as the nightly price multiplied by 30.

## OGR data source with driver: ESRI Shapefile 
## Source: "C:\Users\User\Desktop\work_samples\Data_Visualization\Airbnbs_in_NYC\data\nyc_boroughs_map\nybb.shp", layer: "nybb"
## with 5 features
## It has 4 fields
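
The price cleaning and monthly-price imputation described above can be sketched as follows. This is a minimal illustration in plain Python, not the code used for this report; it assumes that raw prices arrive as strings such as "$1,250.00" and that a missing monthly price is an empty string. The helper names parse_price and impute_monthly are hypothetical.

```python
import re

def parse_price(raw):
    """Convert a price string such as "$1,250.00" to a float.

    Returns None when the value is missing or contains no digits.
    The "$" and "," formatting is an assumption about the raw data.
    """
    if raw is None:
        return None
    cleaned = re.sub(r"[^0-9.]", "", str(raw))
    return float(cleaned) if cleaned else None

def impute_monthly(nightly, monthly):
    """Keep the listed monthly price if present, else use nightly price * 30."""
    if monthly is not None:
        return monthly
    return nightly * 30 if nightly is not None else None

# Hypothetical listing with a missing monthly price
listing = {"price": "$1,250.00", "monthly_price": ""}
nightly = parse_price(listing["price"])
monthly = impute_monthly(nightly, parse_price(listing["monthly_price"]))
print(nightly, monthly)
```

A listing that already states a monthly price keeps it unchanged; only missing values are filled in.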

Question 1 - Overall Location

  1. Provide a map to show where in New York City AirBnB listings are located.

Answer: First, the borough polygons and neighborhood outlines are overlaid on the NYC map tiles to give a base map for all of our visualizations. Each borough and neighborhood is labelled accordingly.

Then a marker for each listing is added using the coordinates in the Airbnb data frame. A few important pieces of information are provided as a label for each listing, including its price, room type and ZIP code.

Zooming into each of the five boroughs shows that listings are available across the whole of Manhattan Island. In the other boroughs, they are more clustered in the northwestern neighbourhoods of Queens (especially Astoria), as well as the northern neighbourhoods of Brooklyn (Greenpoint, Bushwick, Bedford-Stuyvesant, Williamsburg).

  2. Provide a map in which you summarize the density of the AirBnB listings and highlight the hot-spots for AirBnB locations. Make sure to annotate a few hot-spots on the map.

Answer: ggplot2 is used to overlay the borough polygons and neighbourhood borders to create a base map, and the density heat map of Airbnb listings is then drawn on top. The regions with the densest Airbnb listings turn out to be Manhattan Island and northern Brooklyn.

Zooming into the regions of interest, the hotspots with the densest Airbnb listings (per unit area of the longitude-latitude grid) turn out to be Hell’s Kitchen, East Village and Lower East Side in Manhattan, as well as Williamsburg, Bushwick and Bedford-Stuyvesant in Brooklyn.
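
To make the idea of density per unit area of the longitude-latitude grid concrete, here is a simplified sketch in plain Python. It is a swapped-in technique for illustration only: it counts listings per square grid cell, whereas the heat map in this report is a smoothed density drawn with ggplot2. The cell size of 0.01 degrees and the sample coordinates are assumptions.

```python
import math
from collections import Counter

def grid_counts(points, cell=0.01):
    """Count listings per grid cell of side `cell` degrees.

    `points` is an iterable of (latitude, longitude) pairs.
    Cells are keyed by integer (row, column) indices.
    """
    counts = Counter()
    for lat, lon in points:
        counts[(math.floor(lat / cell), math.floor(lon / cell))] += 1
    return counts

def hot_spots(points, cell=0.01, top=3):
    """Return the `top` densest cells as ((south_lat, west_lon), count)."""
    counts = grid_counts(points, cell)
    return [((row * cell, col * cell), n)
            for (row, col), n in counts.most_common(top)]

# Hypothetical coordinates: two listings share a cell, one sits elsewhere
sample = [(40.7615, -73.9915), (40.7618, -73.9918), (40.7155, -73.9555)]
for corner, n in hot_spots(sample, top=2):
    print(corner, n)
```

Raw counts depend strongly on the chosen cell size, which is one reason a smoothed density is usually easier to read on a map.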

As an alternative way of visualizing clusters of Airbnb listings, here is an interactive map in which listings are clustered by proximity. Clicking on a cluster zooms into its region and spiders it out into smaller clusters.

Hovering over a cluster shows the number of listings it contains, as well as the borders of the region it covers.

Question 2 - Renting out your apartment vs. permanent rentals

An Airbnb host can set up a calendar for their listing so that it is only available for a few days or weeks a year. Other listings are available all year round (except when they are already booked). Entire homes or apartments that are highly available and rented frequently year-round to tourists probably don’t have the owner present, are illegal, and more importantly, are displacing New Yorkers.

Hint: The variable availability_365 (what part of the year is the property available to be rented?) is a possible choice for categorizing rentals.

Question 2a)

Choose a combination of both maps and non-mapping visualizations (graphs or tables) to explore where in NYC listings are available sporadically vs. year-round. Make sure to highlight the neighborhoods where most listings appear to be permanent or semi-permanent rentals.

Answer: The availability_365 variable records how many days of the year a listing is available. Using this variable, each listing is sorted into one of the following 5 availability categories, stored as availability_factor:

  1. Less than 30 days
  2. 1-4 months
  3. 4-9 months
  4. Majority of the year (>9 months but not all year round)
  5. All-year-round
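
The categorization above can be sketched as a simple binning function. This is an illustrative Python sketch, not the report's own code; the day cutoffs (30, 120 and 270 days, treating a month as 30 days, with 365 meaning all year round) are assumptions, since the report gives the boundaries only in months.

```python
def availability_factor(days):
    """Map availability_365 (days available per year) to one of 5 categories.

    Cutoffs of 30, 120 and 270 days are assumed (1 month = 30 days).
    """
    if not 0 <= days <= 365:
        raise ValueError("availability_365 must be between 0 and 365")
    if days < 30:
        return "Less than 30 days"
    if days <= 120:
        return "1-4 months"
    if days <= 270:
        return "4-9 months"
    if days < 365:
        return "Majority of the year"
    return "All-year-round"

# Hypothetical availability values
for d in (0, 45, 200, 300, 365):
    print(d, availability_factor(d))
```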

Categories 1 and 2 are most likely rentals that have long-term renters or are mostly inhabited by the host, and that are let out sporadically during summer or holiday seasons when the long-term renters or host leave the city.

Category 3 is most likely made up of rentals that the host uses as a short-term temporary abode at some times of the year (e.g. a holiday home). They are rented out sporadically as Airbnbs when they are not needed by the host.

Categories 4 and 5 are rentals that are highly available and rented out year-round to tourists. They probably don’t have the owner present, are illegal, and more importantly, are possibly displacing New Yorkers.

Each type of availability is summarized by borough, in the datatable below:

The table is visualized as the barplot below:

In all boroughs, all-year-round rentals are the least prevalent type of listing.

Most listings in Manhattan and Brooklyn are available for less than a month, but around 15-17% of listings are available for most of the year.

For Staten Island and the Bronx, the most prevalent kind of listings are available for the majority of the year. They have a lower proportion of short-term listings compared to Manhattan and Brooklyn.

The availability of listings in Queens is comparatively more evenly spread out. About 1/3 of listings are short-term, and approximately 1/4 of listings fall into each of the 3 intermediate to long-term categories.

This table summarizes the boroughs and top 30 neighbourhoods with the most highly available (>9 months in a year) Airbnb listings. They are mostly neighbourhoods in Manhattan and Brooklyn (Bedford-Stuyvesant, Hell’s Kitchen, Williamsburg etc.).

Again, the bar plot below shows the frequency of the semi-permanent/permanent listings in each borough.

This adds up to 9840 listings across the 5 boroughs that are rented out year-round to tourists, and are possibly displacing New Yorkers.

This table summarizes the boroughs and top 30 neighbourhoods with the most sporadically available (<9 months in a year) Airbnb listings. They are again mostly neighbourhoods in Manhattan and Brooklyn, and they coincide with the Airbnb hotspots identified earlier (Williamsburg, Bedford-Stuyvesant, Bushwick, East Village).

The barplot below shows the number of sporadically available listings across the 5 NYC boroughs. This adds up to 38961 listings which periodically rotate between hosting tourists, homeowners, and long- and short-term local renters.

The proportion of semi-permanently/permanently available listings in each neighbourhood is summarized in this table:

Looking at the top 10 Airbnb hotspots, the proportion of semi-permanently/permanently available listings in each neighbourhood ranges widely, from 11% to 46%. Nonetheless, this adds up to a large number of properties that have been sequestered from the local rental market.
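
The per-neighbourhood proportion discussed above could be computed along the following lines. This is a plain-Python sketch for illustration rather than the report's code; it assumes that semi-permanent/permanent means availability_365 above 270 days (more than 9 months, with a month taken as 30 days), and the sample records are hypothetical.

```python
from collections import defaultdict

def long_term_share(listings, threshold=270):
    """Summarize listings per neighbourhood.

    `listings` is an iterable of (neighbourhood, availability_365) pairs.
    Returns {neighbourhood: (n_listings, share_above_threshold)}.
    The 270-day threshold is an assumed stand-in for ">9 months".
    """
    totals = defaultdict(int)
    long_term = defaultdict(int)
    for neighbourhood, days in listings:
        totals[neighbourhood] += 1
        if days > threshold:
            long_term[neighbourhood] += 1
    return {n: (totals[n], long_term[n] / totals[n]) for n in totals}

# Hypothetical records, not taken from the data set
sample = [("A", 300), ("A", 10), ("B", 365), ("B", 280), ("B", 100)]
for name, (n, share) in long_term_share(sample).items():
    print(name, n, round(share, 2))
```

Returning the listing count next to the share matters for interpretation: a high share in a neighbourhood with only a handful of listings carries much less weight than a moderate share in a neighbourhood with thousands.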

The proportions of semi-permanently/permanently available listings in each neighbourhood are also visualized in the choropleth map below:

Neighbourhoods in Upper Manhattan, Northern Brooklyn and Western Queens have <=20% of rentals that are semi-permanent/permanent Airbnb listings (light yellow). However, some of these neighbourhoods have thousands of listings, so a lower percentage still means hundreds of rentals in each neighbourhood that are not available to New Yorkers.

The tourist hotspot neighbourhoods in Midtown to Downtown Manhattan have 20 - 50% of rentals that are semi-permanent/permanent Airbnb listings (light orange). Most of the neighbourhoods in all other boroughs also fall into this range. These neighbourhoods often have hundreds of listings, a substantial portion of which are sequestered from local New York renters.

Most of the neighbourhoods with >50% of rentals that are semi-permanent/permanent Airbnb listings (orange, dark orange and red) only have a small number of listings, so the high percentages probably do not have significant implications for the displacement of New York locals. An exception is the neighbourhood of Canarsie in southeastern Brooklyn, which has 149 listings, half of which are long-term Airbnb listings.

Question 2b)

Some hosts (identified by host_id) operate multiple rentals. Provide a data table of the top hosts, the total number of listings they are associated with, the average nightly price, and the estimated average monthly total income from these listings.

Answer: Hosts are identified by their unique host ID, and the number of listings, average nightly price and estimated average monthly total income (assuming all rentals are rented out for a month) are calculated for each host. The top 10 hosts operating the most Airbnb rentals are presented as follows:

We can see from the host names that the top 10 hosts operating the most Airbnb rentals are a mix of companies and individuals. Companies could be property management agencies or corporate landlords holding multiple properties across NYC, and providing furnished, luxury apartments. Examples include Sonder, Blueground, and Airbnb’s Corporate Housing catalogue.

Even if a host is listed as an individual or a couple, they could still be a small-scale management team hosting furnished rentals and living spaces. Kazuya is one example.
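
The per-host summary described in the answer above can be sketched as a group-and-aggregate step. This is an illustrative Python sketch, not the report's code. It assumes the estimated monthly income is 30 times the sum of a host's nightly prices (every listing booked for 30 nights); the report may instead have used the monthly_price column. The function name summarize_hosts and the sample data are hypothetical.

```python
from collections import defaultdict

def summarize_hosts(listings, top=10):
    """Aggregate listings by host and return the hosts with the most listings.

    `listings` is an iterable of (host_id, nightly_price) pairs.
    Monthly income is estimated as 30 * sum of nightly prices (assumption).
    """
    prices = defaultdict(list)
    for host_id, price in listings:
        prices[host_id].append(price)

    rows = []
    for host_id, host_prices in prices.items():
        rows.append({
            "host_id": host_id,
            "n_listings": len(host_prices),
            "avg_nightly_price": sum(host_prices) / len(host_prices),
            "est_monthly_income": 30 * sum(host_prices),
        })
    # Rank hosts by how many listings they operate
    rows.sort(key=lambda r: r["n_listings"], reverse=True)
    return rows[:top]

# Hypothetical hosts and prices
sample = [(1, 100.0), (1, 200.0), (2, 50.0)]
for row in summarize_hosts(sample):
    print(row)
```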

Question 3 - Most Expensive and Top Reviewed Rentals

Provide an interactive map which shows the Top 100 most expensive and Top 100 best reviewed rentals in NYC. The map should differentiate these two groups and upon clicking on a point on the map should show some basic information (at least 3 pieces of information) in a tool tip.

Answer: First, all rentals are ordered by price (using the price and monthly_price variables) and by review score (using the review_scores_rating and number_of_reviews variables). Then the top 100 most expensive and top 100 best reviewed rentals are extracted as two separate data frames.

The two groups are then added as layers to the interactive base map, and each rental is labelled with its ranking, price, room type, guest capacity and ZIP code.
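
The ranking step in the answer above can be sketched as two sorts. This Python sketch is an illustration only, not the report's code. It assumes price is the primary key with monthly_price as tie-breaker, review_scores_rating is the primary key with number_of_reviews as tie-breaker, and no values are missing. The function name top_listings and the sample records are hypothetical.

```python
def top_listings(listings, n=100):
    """Return (most_expensive, best_reviewed), each with at most n listings.

    `listings` is a list of dicts with the keys price, monthly_price,
    review_scores_rating and number_of_reviews (all non-missing).
    The tie-breaking order is an assumption.
    """
    most_expensive = sorted(
        listings,
        key=lambda r: (r["price"], r["monthly_price"]),
        reverse=True,
    )[:n]
    best_reviewed = sorted(
        listings,
        key=lambda r: (r["review_scores_rating"], r["number_of_reviews"]),
        reverse=True,
    )[:n]
    return most_expensive, best_reviewed

# Hypothetical listings
sample = [
    {"id": 1, "price": 500.0, "monthly_price": 15000.0,
     "review_scores_rating": 90, "number_of_reviews": 10},
    {"id": 2, "price": 900.0, "monthly_price": 27000.0,
     "review_scores_rating": 100, "number_of_reviews": 5},
    {"id": 3, "price": 100.0, "monthly_price": 3000.0,
     "review_scores_rating": 100, "number_of_reviews": 50},
]
expensive, reviewed = top_listings(sample, n=2)
print([r["id"] for r in expensive], [r["id"] for r in reviewed])
```

Using the number of reviews as a tie-breaker favours listings whose top score rests on many reviews over those with a single perfect rating.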

Both the most expensive and the best reviewed rentals are mostly in Manhattan and north/northwestern Brooklyn.

The top 100 most expensive rentals range from $2,000 to $10,000 per night. The 100 best reviewed rentals all have a mean review score of 100, based on an average of 91.88 reviews.